Engineering Production of a Novel Diterpene Synthase Precursor in Nicotiana benthamiana

Diterpene biosynthesis commonly originates with the methylerythritol phosphate (MEP) pathway in chloroplasts, leading to the C20 substrate, geranylgeranyl pyrophosphate (GGPP). Previous work demonstrated that over-expression of genes responsible for the first and last steps in the MEP pathway in combination with GERANYLGERANYL PYROPHOSPHATE SYNTHASE (GGPPS) and CASBENE SYNTHASE (CAS) is optimal for increasing flux through to casbene in Nicotiana benthamiana. When the gene responsible for the last step in the MEP pathway, 4-HYDROXY-3-METHYLBUT-2-ENYL DIPHOSPHATE REDUCTASE (HDR), is removed from this combination, casbene is still produced but in lower amounts. Here, we report the unexpected finding that this reduced gene combination also results in the production of 16-hydroxy-casbene (16-OH-casbene), consistent with the presence of 16-hydroxy-geranylgeranyl pyrophosphate (16-OH-GGPP) in the same material. Indirect evidence suggests the latter is formed as a result of elevated levels of 4-hydroxy-3-methyl-but-2-enyl pyrophosphate (HMBPP) caused by a bottleneck at the HDR step responsible for conversion of HMBPP to dimethylallyl pyrophosphate (DMAPP). Over-expression of a GERANYLLINALOOL SYNTHASE from Nicotiana attenuata (NaGLS) produces 16-hydroxy-geranyllinalool (16-OH-geranyllinalool) when transiently expressed with the same reduced combination of MEP pathway genes in N. benthamiana. This work highlights the importance of pathway flux control in metabolic pathway engineering and the possibility of increasing terpene diversity through synthetic biology.

We recently developed a Nicotiana benthamiana platform optimized for production of casbene and derivatives by engineering flux through the MEP pathway (Forestier et al., 2021). We demonstrated how this platform could be used for production of the lathyrane jolkinol C by introduction of functionally characterized P450 oxidases (Forestier et al., 2021). The elucidation of the biosynthetic steps from casbene to the tigliane (Kulkosky et al., 2001; Kissin and Szallasi, 2011; De Ridder et al., 2021), ingenane (Siller et al., 2010) and jatrophane (Corea et al., 2009; Hadi et al., 2013) classes of diterpenoids has also been investigated but remains to be resolved.
In our work on N. benthamiana to optimize flux through the MEP pathway to GGPP (the substrate for casbene production), we unexpectedly detected the novel compound 16-hydroxy-casbene. The design of our experiments suggested that this metabolite did not arise from a hydroxylation downstream of GGPP, and we therefore hypothesized that 16-hydroxy-casbene could derive from an alternative substrate. Herein, we present results leading us to conclude that 16-hydroxy-GGPP can act as a novel precursor for diterpene biosynthesis.

Isolation and Quantification of Diterpenoids, GGPP, and OH-GGPP
We detected both casbene and 16-OH-casbene in transiently transformed plants by extracting around 200 mg of dry material with 5 ml of hexane containing 100 μg/ml of β-caryophyllene, then sonicating for 15 min. We quantified the compounds by GC-MS as detailed in Forestier et al. (2021). For geranyllinalool and its derivatives, we extracted around 150 mg dry weight (DW) of infiltrated tobacco with 1 ml of ethyl acetate containing 100 mg/L of β-caryophyllene. The samples were shaken overnight at 2000 rpm on an IKA Vibrax VXR basic shaker and then centrifuged, and 100 μl of the supernatant was used directly for GC-MS.
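The internal-standard arithmetic implied by this extraction can be sketched as follows. This is a minimal illustration and not the quantification procedure of Forestier et al. (2021): it assumes a response factor of 1 between analyte and β-caryophyllene, and the function name and example peak areas are hypothetical.

```python
def analyte_ug_per_g_dw(area_analyte, area_standard,
                        standard_ug_per_ml, solvent_ml, dry_weight_g,
                        response_factor=1.0):
    """Estimate analyte content (ug per g dry weight) from an internal standard.

    response_factor=1.0 is an assumption: it treats analyte and standard
    as giving the same detector response per unit mass.
    """
    if area_standard <= 0 or dry_weight_g <= 0:
        raise ValueError("standard area and dry weight must be positive")
    # Total mass of internal standard present in the extraction solvent.
    standard_ug = standard_ug_per_ml * solvent_ml
    # Scale the standard mass by the peak-area ratio to get analyte mass.
    analyte_ug = (area_analyte / area_standard) * standard_ug / response_factor
    return analyte_ug / dry_weight_g


# Hexane extraction from the text: 5 ml at 100 ug/ml, about 0.2 g dry material.
# The peak areas below are made-up numbers for illustration only.
casbene = analyte_ug_per_g_dw(area_analyte=3.0e6, area_standard=1.5e6,
                              standard_ug_per_ml=100, solvent_ml=5,
                              dry_weight_g=0.2)
print(f"{casbene:.0f} ug/g DW")
```

In practice the response factor would be calibrated against an authentic standard where one is available.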
We isolated GGPP and 16-OH-GGPP by adapting the protocol described by Nagel et al. (2014). Approximately 750 mg of ground dry material was extracted with 15 ml of methanol/H2O (7:3, v/v) and sonicated for 30 min. We then added 5 ml of water to the mixture, centrifuged for 3 min at 2000 g, and filtered through Whatman filter paper grade 1 and cotton. The cleared extracts were passed through Chromabond HX RA columns pre-conditioned with 5 ml of methanol and 5 ml of water, and compounds were eluted with 3 ml of 1 M ammonium formate in methanol. Each eluate was dried under a stream of nitrogen and re-dissolved in 250 μl of water/methanol (1:1). We transferred 100 μl into glass HPLC vials, and 2 μl aliquots were analyzed by LC-MS as described by Catania et al. (2018). Additional high-resolution mass spectral data were obtained on a parallel LC interfaced to a Thermo Orbitrap Fusion mass spectrometer, operating in ESI mode at 500000 (FWHM) resolution for MS1 data, with MS2 data collected at 120000 resolution using stepped collision energies between 20 and 60 units in both HCD and CID modes.
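As a rough worked example, the amount of plant material represented by one LC-MS injection can be estimated from the volumes given above. The sketch assumes that the whole cleared extract is loaded onto the column and that recovery through clean-up is complete; neither is stated in the protocol, and the function name is hypothetical.

```python
def on_column_equivalent_mg(dry_material_mg, final_volume_ul, injection_ul,
                            recovery=1.0):
    """Dry material (mg) represented by a single injection.

    recovery is the fraction of analyte surviving extraction and clean-up;
    1.0 (complete recovery) is an assumption, not a measured value.
    """
    if not 0 < recovery <= 1:
        raise ValueError("recovery must be in the interval (0, 1]")
    return dry_material_mg * recovery * injection_ul / final_volume_ul


# 750 mg extracted, eluate re-dissolved in 250 ul, 2 ul injected.
equivalent = on_column_equivalent_mg(750, 250, 2)
print(f"{equivalent:.1f} mg dry material per injection")
```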

Accumulation and Purification of Compounds for NMR Spectroscopy
To identify 16-OH-casbene, we ground 4.9 g of dry material obtained from 10 full-grown plants infiltrated with DXS, GGPPS, and CAS, and extracted it with 100 ml of hexane. After 1 h of sonication and 2 days of shaking, the extract was centrifuged for 3 min at 2000 g, filtered through Whatman paper grade 1 and cotton, and evaporated to obtain 350 mg of oily residue. The residue was re-suspended in 10 ml of hexane/ethyl acetate (70:30, v/v) and purified through a 40 g Buchi silica column on a PuriFlash® 4250 system (Interchim). We used the same method of flash chromatography as described in King et al. (2014) to fractionate the extract into 80 samples. GC-MS was used to identify the fraction containing our compound of interest, and 2.6 mg of this was obtained after evaporation, at sufficient purity for direct 1H NMR analysis on a Bruker AVIII 700 MHz instrument equipped with a cryoprobe.
For 16-OH-geranyllinalool, we infiltrated 40 young plants with DXS, GGPPS, and NaGLS, which provided 10 g of dry material after freeze-drying and grinding. We extracted this with 150 ml of ethyl acetate and left it to shake for 5 days on a rotary shaker. After centrifugation and filtration as detailed above, we reduced the volume to 1 ml before re-suspending in 9 ml of hexane/ethyl acetate and purifying with the same column and method as described above. The fractions of interest were combined and dried to obtain 16.4 mg of extract that was further purified with a reverse phase column [C18-HQ 5 μm 250 mm × 10 mm (Interchim)] to remove the pigment content. The reverse phase column was first equilibrated with solvent A (a water/acetonitrile mix, 95:5, v/v) for eight column volumes (CV), before injecting the extract, diluted in 2.5 ml of the same solvent, into a 5 ml injection loop. The separation method consisted of one CV of solvent A, followed by a gradient of nine CV to reach 100% acetonitrile (solvent B). This solvent was maintained for a further 10 CV, and the entire run was carried out at a flow rate of 3 ml/min. We used an in-line connected Advion Expression compact mass spectrometer (CMS), which enabled product isolation guided by mass spectra. To evaluate the fragmentation of 16-OH-geranyllinalool, we additionally ran the extracts on UPLC-MS, allowing us to determine two main ions at m/z 271 and m/z 289, which we used to select our compound of interest on the PuriFlash-CMS. We collected one fraction, which after evaporation contained 4.6 mg of sufficiently pure metabolite for NMR identification.
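Because the reverse phase method is specified in column volumes, converting it to run time requires a value for one CV. The sketch below assumes that one CV equals the empty-cylinder volume of the 250 mm × 10 mm column, which is an upper bound (a packed column holds less liquid), so the times it prints are indicative only. The function names are hypothetical.

```python
import math


def geometric_column_volume_ml(length_mm, inner_diameter_mm):
    """Empty-cylinder volume in ml; an upper bound on the true column volume."""
    radius_cm = inner_diameter_mm / 20.0          # mm diameter -> cm radius
    return math.pi * radius_cm ** 2 * (length_mm / 10.0)


def gradient_timetable(column_volume_ml, flow_ml_per_min, segments):
    """Return (label, start_min, end_min) for consecutive segments given in CV."""
    table, elapsed = [], 0.0
    for label, n_cv in segments:
        duration = n_cv * column_volume_ml / flow_ml_per_min
        table.append((label, elapsed, elapsed + duration))
        elapsed += duration
    return table


cv_ml = geometric_column_volume_ml(250, 10)       # assumed value of one CV
segments = [
    ("hold solvent A", 1),
    ("gradient to 100% acetonitrile", 9),
    ("hold 100% acetonitrile", 10),
]
timetable = gradient_timetable(cv_ml, 3.0, segments)  # 3 ml/min from the text
for label, start, end in timetable:
    print(f"{label}: {start:.1f} to {end:.1f} min")
```

The eight CV of equilibration precede injection and are therefore left out of the timetable.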

Transient Expression of 1-DEOXY-D-XYLULOSE-5-PHOSPHATE SYNTHASE With GERANYLGERANYL PYROPHOSPHATE SYNTHASE and CASBENE SYNTHASE in Nicotiana benthamiana Can Produce Metabolites in Addition to Casbene
In previous work, we tested the transient co-expression of different MEP pathway genes and GGPPS from A. thaliana with CAS from Jatropha curcas to identify the combination giving the highest production of casbene. We determined that the combination of DXS (catalyzing the first step in the MEP pathway), HDR (4-hydroxy-3-methylbut-2-enyl diphosphate reductase, catalyzing the last step), GGPPS, and CAS resulted in an up to 5-fold increase in casbene production, compared to CAS expression alone (Forestier et al., 2021). Omitting HDR from this combination resulted in lower production of casbene (Forestier et al., 2021). Further inspection of the total ion chromatograms of plant extracts from this reduced gene combination identified three additional peaks (Figure 1A), compared to co-expression of DXS, HDR, GGPPS, and CAS, which only produced casbene (Figure 1B).
The largest of the three additional peaks was present in sufficient amount to allow its identification as 16-hydroxy-casbene (16-OH-casbene) by NMR spectroscopy (Supplementary Figure S1). 16-OH-casbene was present at approximately 30% of casbene levels when HDR was absent from the gene combination (Supplementary Figure S2).
We therefore considered that 16-OH-GGPP could be formed if there were an excess of HMBPP, caused by the over-expression of DXS and GGPPS with insufficient conversion to DMAPP and/or IPP because this conversion depends on an endogenous HDR. 16-OH-GGPP could then be incorporated into the casbene backbone, assuming that CAS accepts 16-OH-GGPP as substrate for production of 16-OH-casbene (Figure 2B).

Detection of GGPP and Putative 16-OH-GGPP in Planta by Transient Over-Expression of DXS + GGPPS and DXS + HDR + GGPPS
To establish whether 16-OH-GGPP accumulates depending on the gene combination, either DXS + GGPPS or DXS + HDR + GGPPS were transiently expressed in N. benthamiana and C20 prenyl diphosphate intermediates were extracted as described by Nagel et al. (2014). UPLC-MS/MS negative mode analysis of methanolic extracts was used to detect GGPP by selecting the m/z range 449-450 and by comparison with an authentic GGPP standard (Figures 3A-C).
In the absence of a 16-OH-GGPP standard, we predicted that, since GGPP is detected in the m/z range 449-450, 16-OH-GGPP, which carries one additional oxygen atom, would be detected at m/z 465.2. We were unable to detect the m/z 465.2 peak in the gene combination of DXS + HDR + GGPPS (Figure 3E), consistent with the hypothesis that 16-OH-GGPP is produced when HMBPP reduction is limiting due to lack of HDR activity.
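The m/z values used here can be checked from monoisotopic masses. The sketch below assumes the free-acid formulas C20H36O7P2 for GGPP and C20H36O8P2 for 16-OH-GGPP and a singly deprotonated [M-H]- ion in negative mode; the helper names are hypothetical and the calculation is only a cross-check, not part of the reported analysis.

```python
# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "P": 30.97376163}
PROTON = 1.00727647  # mass of a proton (u)


def monoisotopic_mass(formula):
    """Neutral monoisotopic mass for a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def mz_deprotonated(formula):
    """m/z of the singly charged [M-H]- ion (neutral mass minus one proton)."""
    return monoisotopic_mass(formula) - PROTON


# Assumed free-acid formulas; 16-OH-GGPP differs from GGPP by one oxygen atom.
GGPP = {"C": 20, "H": 36, "O": 7, "P": 2}
OH_GGPP = {"C": 20, "H": 36, "O": 8, "P": 2}

mz_ggpp = mz_deprotonated(GGPP)
mz_oh_ggpp = mz_deprotonated(OH_GGPP)
print(f"GGPP [M-H]-:       {mz_ggpp:.4f}")
print(f"16-OH-GGPP [M-H]-: {mz_oh_ggpp:.4f}")
```

The two ions differ by the mass of one oxygen atom, which is the basis for assigning the m/z 465.2 signal to a hydroxylated GGPP.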

Transient Expression in N. benthamiana of a GERANYLLINALOOL SYNTHASE From Nicotiana attenuata (NaGLS) Results in Production of Both Geranyllinalool and 16-Hydroxy-Geranyllinalool When Co-expressed With DXS + GGPPS But Only Geranyllinalool When HDR Is Included in the Gene Combination
To further explore whether 16-OH-GGPP could be used by other diterpene synthases, we transiently expressed the Nicotiana attenuata geranyllinalool synthase (NaGLS) in N. benthamiana, alone or in combination with DXS + HDR + GGPPS (Figure 4 and Supplementary Figure S3). This resulted in accumulation of geranyllinalool in both cases. Co-expression of NaGLS + DXS + GGPPS produced geranyllinalool but also two additional peaks with retention times (Rt) of 30.0 and […] Figure S4) and flash chromatography to purify the compound giving rise to the larger peak at Rt 30.0 min in sufficient quantities to permit its identification by NMR spectroscopy as 16-OH-geranyllinalool (Supplementary Figure S5). This novel natural product represented up to 25% of the geranyllinalool peak (Figure 4C). This proportion is comparable to that of 16-OH-casbene relative to casbene (30%) with the same reduced gene combination (Supplementary Figure S2). Despite obvious parallels with the regio-isomeric 17-OH-geranyllinalool (Supplementary Figure S6), the precursor of the insecticidal diterpene glycosides in many Nicotiana species (Snook et al., 1997; Jassbi et al., 2010; Falara et al., 2014), there is no indication that 16-OH-geranyllinalool could be involved in the biosynthetic pathway to 17-OH-diterpenes. Indeed, recent work demonstrated that two cytochrome P450s from N. attenuata are responsible for the 17-hydroxylation of geranyllinalool (Li et al., 2021).
We did not detect any other hydroxy geranyllinalool compounds apart from 16-OH-geranyllinalool with the DXS + GGPPS + NaGLS gene combination, providing further evidence that 16-OH-geranyllinalool is derived from a direct conversion of 16-OH-GGPP.

DISCUSSION
This work provides evidence for the formation of 16-OH-GGPP when the flux through the MEP pathway in N. benthamiana is altered. Both casbene synthase from Jatropha curcas and geranyllinalool synthase from N. attenuata yield 16-hydroxylated versions of their usual diterpene products when transiently expressed in N. benthamiana leaves that produce 16-OH-GGPP. The detection of additional minor compounds from both enzymes suggests that further products may also arise when 16-OH-GGPP is used as substrate. We hypothesize that 16-OH-GGPP is formed through the action of A. thaliana GGPP synthase when HMBPP levels are elevated due to increased flux through the MEP pathway and a bottleneck exists at the HDR step. When the HDR enzyme, which reduces the hydroxy group in HMBPP to make DMAPP, is co-expressed with DXS and GGPPS, neither 16-OH-GGPP nor 16-OH-diterpenes are detected. Taken together, the evidence presented supports formation of 16-OH-casbene or 16-OH-geranyllinalool by promiscuous diterpene synthases acting on 16-OH-GGPP rather than P450-based hydroxylation of casbene or geranyllinalool. The fact that formation of these 16-hydroxylated compounds is exclusively associated with over-expression of the first step in the MEP pathway combined with omission of the last step points to the bottleneck at the HDR step giving rise to 16-OH-GGPP via a plausible route.
Interestingly, in Escherichia coli, overproduction of HMBPP is cytotoxic, and this effect is removed by activation of IspH, the gene encoding the E. coli equivalent of plant HDR (Li et al., 2017). The transient expression approach we use in planta may have by-passed such regulation, if indeed it is important in N. benthamiana.
There are precedents from the terpenoid literature for the formation of more highly oxidized precursors, which are then accepted as alternatives to the normal substrate in a known biosynthetic pathway. Thus, 2,3-oxidosqualene, the usual precursor of triterpenes and sterols, can undergo a second oxidation by the endogenous squalene epoxidase to form dioxidosqualene when it accumulates in yeast (Salmon et al., 2016). A mutated triterpene synthase was shown to prefer this doubly oxygenated substrate to the normal 2,3-oxidosqualene, leading to the production of unusual triterpenes that incorporate an additional oxygen atom in the fifth ring (Salmon et al., 2016). There are also examples of synthetic chemistry work focusing on obtaining analogues of the sesquiterpene precursor farnesyl pyrophosphate (Dolence and Dale Poulter, 1996; Placzek and Gibbs, 2011) or even (Z,E,E)-geranylgeranyl pyrophosphate (Minutolo et al., 2006), demonstrating the interest in alternative substrates for terpenoid production.
It is perhaps unlikely that 16-OH-GGPP is a significant substrate in nature when the MEP and diterpene biosynthetic pathways are subject to their normal mechanisms of regulation. However, the substantial level of 16-hydroxylated diterpenes produced by native diterpene synthases in transient expression systems might suggest that 16-OH-diterpenes can become more biologically relevant under abnormal circumstances, when such regulation is compromised.
In terms of engineering biology, this work demonstrates the importance of regulating flux through biosynthetic pathways to ensure intermediates do not accumulate, because enzyme substrate promiscuity can result in the production of unexpected end products. On the other hand, this example shows that the generation of a novel GGPP substrate can open the possibility of entirely new diterpenes that could be further modified and evaluated in terms of their bioactivity.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
EF designed experiments, performed experiments, and analyzed data. GB, DH, and TL performed experiments, analyzed data, and contributed to the writing of the manuscript. EF and IG wrote the manuscript. IG contributed to the design and analysis of the study. All authors contributed to the article and approved the submitted version.

FUNDING
This research was funded by the BBSRC and Innovate UK under grant number BB/M018210/01. High-resolution mass spectrometry was performed using equipment within the Centre of Excellence in Mass Spectrometry (University of York), funded by Science City York (Yorkshire Forward, EP/K039660/1, EP/M028127/1).